Abstract

It is common, in both theoretical and experimental studies, to separately discuss 
quark and gluon jets. However, even at parton level, widely-used jet algorithms fail 
to provide an infrared safe way of making this distinction. We examine the origin 
of the problem, and propose a solution in terms of a new 'flavour-$k_t$' algorithm.
As well as being of conceptual interest this can be a powerful tool when combining 
fixed-order calculations with multi-jet resummations and parton showers. It also has 
applications to studies of heavy-quark jets. 



1 Introduction 



A search through the SPIRES database reveals over 350 articles whose titles contain the 
expressions 'quark jet(s)' or 'gluon jet(s)' [1]. The idea of quark and gluon jets appears so
intuitive that it hardly seems necessary to examine the question of what it means. Yet, 
when going beyond leading order perturbative QCD, the concept of quark and gluon jets 
is only meaningful once a procedure has been defined to classify an ensemble of partons 
into a set of jets, each with a well-defined flavour — a flavour that is insensitive to the 
addition of extra soft or collinear branchings. To our knowledge the question of how to do 
this in general has not been addressed in the literature. 

As well as being of intrinsic interest, the question of how to define the flavour of a 
partonic jet is becoming of increasing practical importance as the study of QCD is extended 
to multi-jet ensembles (by jets we mean both incoming and outgoing ones): in studies of 
$e^+e^- \to$ jets one knows that the basic 2-jet Born configuration consists of quark jets; but
for jet production at hadron colliders, the Born configuration involves 2 incoming and 2
outgoing jets and many flavour channels are possible, $qq \to qq$, $q\bar q \to gg$, $gg \to gg$, etc.
The ability to assign flavours to the jets is especially useful when combining fixed-order 
predictions with all-order calculations (be it for parton showers as in [2] or for analytical 
resummation [3, 4, 5]). This is because all-order calculations are carried out for a fixed
Born configuration, with a single flavour channel at a time, while fixed-order calculations 
implicitly sum over all flavour channels and can at best be split up a posteriori to match 
onto the individual flavour channels of the all-order calculation. 

As a concrete example, consider the calculation of higher-order corrections to the process
$q\bar q \to q\bar q$, fig. 1a. An all-order calculation treats the addition of any number of
soft/collinear gluons and extra $q\bar q$ pairs implicitly, leaving the underlying $2\to2$ flavours
unchanged. When trying to supplement this with results of a fixed-order calculation one
encounters the problem that higher-order contributions cannot be uniquely assigned to
any given $2\to2$ flavour channel: the $\mathcal{O}(\alpha_s)$ corrections to $q\bar q \to q\bar q$ include e.g. a
$q\bar q \to q\bar q \to q\bar q g$ piece, but a fixed-order calculation gives only the squared sum of all
$q\bar q \to q\bar q g$ diagrams, among them $q\bar q \to q\bar q \to q\bar q g$ and $q\bar q \to gg \to q\bar q g$, illustrated in
fig. 1b and 1c respectively. There can exist no unambiguous procedure for separating the
$q\bar q \to q\bar q g$ contribution into its different underlying channels, both because the different
channels are not individually gauge invariant and because they interfere when squaring the
amplitude.

One therefore needs a prescription to assign $q\bar q \to q\bar q g$ either to the $q\bar q \to q\bar q$ or the
$q\bar q \to gg$ underlying Born $2\to2$ process (or else to declare it irreducibly $2\to3$ like); it is
only in the $q\bar q \to q\bar q$ case that one needs to put it together with the $q\bar q \to q\bar q$ all-order
calculation. This reclassification of a $2\to3$ event as a $2\to2$ event is similar conceptually
to what is done in a normal jet algorithm, except that not only should the momenta of
the resulting $2\to2$ configuration be infrared and collinear safe, but so should the flavours.
Accordingly we call it a jet-flavour algorithm.

Figure 1: (a) Specific $q\bar q \to q\bar q$ flavour channel for a $2\to2$ parton scattering process; (b)
higher-order diagram that can be seen as a correction to (a); (c) higher-order diagram that
can be seen as a correction to the process $q\bar q \to gg$, but with the same final-state partons
as (b).

An obvious approach to defining jet flavours at the perturbative level would be to start
with an existing jet algorithm, such as the $k_t$-clustering [6, 7, 8] or cone [9] algorithm, that
defines jets such that each particle belongs to at most one jet. One can then determine
the net flavour content of each of the jets, as the total number of quarks minus antiquarks
for each quark flavour. Jets with no net flavour are identified as gluon jets, those with
(minus) one unit of net flavour are (anti)quark jets, while those with more than one unit
of flavour (or both a flavour and a different antiflavour) cannot be identified with a single
QCD parton.
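The net-flavour bookkeeping described above can be sketched in a few lines. This is only an illustration: the integer particle codes (a positive integer for a quark of a given flavour, its negative for the antiquark, 21 for a gluon) and the function names are assumptions made here, not something specified in the text.

```python
from collections import Counter

GLUON = 21  # assumed code for a gluon; +f / -f denote quark / antiquark of flavour f


def net_flavour(constituents):
    """Quarks minus antiquarks for each flavour, with exact cancellations dropped."""
    net = Counter()
    for code in constituents:
        if code != GLUON:
            net[abs(code)] += 1 if code > 0 else -1
    return {flavour: n for flavour, n in net.items() if n != 0}


def classify(constituents):
    """Label a jet as gluon, quark, antiquark or multi-flavoured."""
    net = net_flavour(constituents)
    if not net:
        return "gluon"
    if len(net) == 1:
        (n,) = net.values()
        if n == 1:
            return "quark"
        if n == -1:
            return "antiquark"
    return "multi-flavoured"
```

For instance, a jet made of a $u$ quark and two gluons is classified as a quark jet, while a jet containing a $u$ and a $\bar d$ carries both a flavour and a different antiflavour and is labelled multi-flavoured.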



Figure 2: A large-angle soft gluon splitting to a large-angle soft $q\bar q$ pair ($k_3$, $k_4$) with the
$q$ and $\bar q$ then clustered into different jets ($k_1$, $k_2$).

Applied to the $k_t$ or cone algorithms, this procedure yields a jet flavour that is infrared
(IR) safe at the (relative) order $\alpha_s$ discussed in our example above. However at (relative) order
$\alpha_s^2$ a large-angle soft gluon can split into a widely separated soft $q\bar q$ pair and the $q$ and
$\bar q$ may end up being clustered into different jets, 'polluting' the flavour of those jets, see
fig. 2. Because this happens for arbitrarily soft gluons branching to quarks, the resulting
jet flavours are infrared unsafe from order $\alpha_s^2$ onwards. We are not aware of this problem
having been discussed previously in the literature, though there do exist statements that
are suggestive of IR safety issues when discussing flavour [10].

In section 2 we shall discuss IR flavour unsafety with respect to the $k_t$ (or 'Durham')
algorithm in $e^+e^-$ [6]. There we shall recall that the $k_t$ closeness measure is specifically
related to the divergences of QCD matrix elements when producing soft and collinear
gluons. However there are no divergences for the production of soft quarks and, as we shall
see, it is the use for quarks of a distance measure designed for gluons that leads to the
infrared unsafety of jet flavour in the $k_t$ algorithm. By taking into account the absence
of a soft-quark divergence when designing the jet-clustering distance measure, one can
eliminate the infrared divergence of the jet flavour.

The essence of the modification to the $k_t$ distance is that instead of the $\min(E_i^2, E_j^2)$
factor that appears usually, one needs to use $\max(E_i^2, E_j^2)$ when the softer of $i,j$ is a quark.
In section 3 we will examine how this can be extended to processes with incoming hadrons.
There the added difficulty is the need for a particle-beam distance measure. Traditionally
this involves only one dimensionful scale, related to the squared transverse momentum $k_{ti}^2$
of the particle. There is a sense in which this can be understood as $\min(k_{ti}^2, k_{tB}^2)$, where $k_{tB}^2$
is some transverse scale associated with the beam that is larger than all $k_{ti}^2$ and so could
up to now be ignored. In order to obtain a sensible jet-flavour algorithm we shall however
need to consider also $\max(k_{ti}^2, k_{tB}^2)$ and therefore in section 3 we shall investigate how to
construct sensible 'beam scales'.

As well as explaining how to build jet algorithms that provide an infrared safe jet 
flavour, we shall also examine how they fare in practice. In $e^+e^-$ it will be possible to
carry out tests both with an NLO code (which explicitly reveals the IR unsafety of flavour 
in traditional jet algorithms) and with parton-shower Monte Carlo codes. For hadron- 
hadron collisions only parton-shower Monte Carlo tests will be possible because none of the 
currently available NLO codes provides access to the final-state parton flavour information. 



2 The $k_t$ flavour algorithm for $e^+e^-$

The aim of clustering algorithms is to recombine particles into jets in a manner that 
approximates the inverse of the nearly probabilistic picture of ordered QCD branching. 
Since, however, the branching itself is a quantum mechanical process, there is no unique 
way of inverting it for a given final ensemble of particles. What can at most be done is 
to design it to work correctly in limits in which the QCD branching behaves classically, 
e.g. when a given particle is emitted as if from a single identifiable parent. The design 
of good jet algorithms is therefore more a craft than a deductive science. Nevertheless 
certain general principles will help us identify how to extend existing jet algorithms to deal 
properly with flavour. 

Let us start by considering the most widespread clustering algorithm, the standard
$e^+e^-$ Durham (or $k_t$) algorithm [6]:

1. Introduce a distance measure between every pair of partons $i$, $j$:

$$y_{ij}^{(D)} = \frac{2\min(E_i^2, E_j^2)}{Q^2}\,(1 - \cos\theta_{ij})\,, \qquad (1)$$

where $E_i$ is the energy of particle $i$, $\theta_{ij}$ is the angle between particles $i$ and $j$ and $Q$
is the centre of mass energy.

2. Find the specific $i$ and $j$ that correspond to the smallest $y_{ij}^{(D)}$ and recombine them
according to some recombination scheme (we shall here use the E scheme, which
sums the four-momenta).

3. Repeat the procedure until all $y_{ij}^{(D)} > y_{\rm cut}$ (or, alternatively, until one reaches a
predetermined number of jets).
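The three steps above can be sketched as follows. This is a minimal illustration, not an optimised implementation: the representation of momenta as `(E, px, py, pz)` tuples and the function names are assumptions made here, and the angle is taken between the three-momenta.

```python
import math


def y_durham(p, q, Q):
    """Durham distance of eq. (1) for four-momenta given as (E, px, py, pz)."""
    norm_p = math.sqrt(p[1] ** 2 + p[2] ** 2 + p[3] ** 2)
    norm_q = math.sqrt(q[1] ** 2 + q[2] ** 2 + q[3] ** 2)
    cos_theta = (p[1] * q[1] + p[2] * q[2] + p[3] * q[3]) / (norm_p * norm_q)
    return 2.0 * min(p[0], q[0]) ** 2 / Q ** 2 * (1.0 - cos_theta)


def cluster(momenta, Q, y_cut):
    """Recombine the closest pair (E scheme) until every y_ij exceeds y_cut."""
    jets = [tuple(p) for p in momenta]
    while len(jets) > 1:
        # Step 2: find the pair with the smallest distance.
        y_min, a, b = min(
            (y_durham(jets[a], jets[b], Q), a, b)
            for a in range(len(jets))
            for b in range(a + 1, len(jets))
        )
        # Step 3: stop once all remaining distances are above the cut.
        if y_min > y_cut:
            break
        merged = tuple(x + y for x, y in zip(jets[a], jets[b]))  # E scheme: sum four-momenta
        jets = [p for k, p in enumerate(jets) if k not in (a, b)] + [merged]
    return jets


# Two hard, back-to-back partons plus a soft gluon nearly collinear with the second one.
event = [
    (50.0, 0.0, 0.0, 50.0),
    (45.0, 0.0, 0.0, -45.0),
    (5.0, 0.5, 0.0, -math.sqrt(5.0 ** 2 - 0.5 ** 2)),
]
jets = cluster(event, Q=100.0, y_cut=1e-3)
```

In this toy event the soft gluon is absorbed into its collinear neighbour and the clustering stops at two jets, since the distance between the two hard objects is of order one.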



The defining characteristic of such clustering algorithms is the distance measure, because
it determines the order in which emissions are recombined.$^1$ It is closely related to the
divergences of the QCD matrix elements: for a gluon $j$ that is soft and collinear to a
gluon $i$ the product of phase-space and matrix element for a parent gluon to branch to $i$
and $j$ is

$$[dk_j]\,|M^2_{g\to g_i g_j}(k_j)| \simeq \frac{\alpha_s C_A}{\pi}\,\frac{dE_j}{\min(E_i,E_j)}\,\frac{d\theta_{ij}^2}{\theta_{ij}^2}\,, \qquad (E_j \ll E_i\,,\ \theta_{ij} \ll 1)\,. \qquad (2)$$

Thus with the distance measure eq. (1), two particles are deemed to be close when either
of the parameters in which the matrix element has a divergence, $E_j = \min(E_i, E_j)$ or $\theta_{ij}$,
is small. This is a key characteristic of a good distance measure because where there is a
strong divergence there will be many splittings that are independent of the 'hard' properties
of the event; such splittings should be undone (recombined) at the early stages of the
clustering to leave at the end only well-separated hard pseudo-jets.

A second key characteristic of a distance measure can be understood by examining the
Jade algorithm [12], which is identical to (and predates) the $k_t$ algorithm, except that its
distance measure is

$$y_{ij}^{(J)} = \frac{2E_iE_j}{Q^2}\,(1 - \cos\theta_{ij})\,. \qquad (3)$$

Again, for $E_j \ll E_i$, $\theta_{ij} \ll 1$, the distance $y_{ij}^{(J)}$ becomes smaller when either $E_j$ or $\theta_{ij}$ is
reduced, i.e. whenever the matrix-element divergence is made stronger. However it also
becomes smaller when $E_i$ is reduced, even though a modification of $E_i$ has no effect on the
divergence of the matrix element in eq. (2). The undesirable consequence of this is that the
Jade algorithm strongly 'prefers' to recombine pairs of soft particles at large relative angle,
instead of combining the individual soft particles with any collinear but harder neighbours,
and so 'pulls' particles out of their natural jet.
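The difference between the two measures can be checked numerically. The sketch below uses an illustrative configuration chosen here (it is not taken from the text): two hard particles of energy 50 at $Q = 100$, each accompanied by a soft particle of energy 0.1 at an angle of 0.2, so that the two soft particles are separated by a large angle.

```python
import math


def y_durham(Ei, Ej, theta, Q):
    """Durham measure, eq. (1)."""
    return 2.0 * min(Ei, Ej) ** 2 * (1.0 - math.cos(theta)) / Q ** 2


def y_jade(Ei, Ej, theta, Q):
    """Jade measure, eq. (3)."""
    return 2.0 * Ei * Ej * (1.0 - math.cos(theta)) / Q ** 2


Q, E_hard, E_soft = 100.0, 50.0, 0.1
theta_collinear = 0.2             # soft particle relative to its hard neighbour
theta_soft_pair = math.pi - 0.4   # relative angle of the two soft particles

jade_collinear = y_jade(E_hard, E_soft, theta_collinear, Q)
jade_soft_pair = y_jade(E_soft, E_soft, theta_soft_pair, Q)
durham_collinear = y_durham(E_hard, E_soft, theta_collinear, Q)
durham_soft_pair = y_durham(E_soft, E_soft, theta_soft_pair, Q)
```

With these numbers the Jade distance is smaller for the soft pair than for a soft particle and its collinear hard neighbour, whereas the Durham distance has the opposite ordering.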

From this brief discussion, one can see that the distance measure should satisfy two main 
characteristics: (a) two particles should be considered close when there is a corresponding 
divergence in their matrix elements;$^2$ and (b) the measure should not inadvertently introduce
'spurious' extra closeness for a variation of the momenta that does not lead to any
extra divergence (see however the discussion below eq. (6)).

1 There exist also jet algorithms in which the measure that determines the order of recombination differs
from that defining the stopping point for recombination, e.g. the Cambridge and Aachen algorithms.

2 This discussion is somewhat of an oversimplification: for example the Angular-ordered Durham
algorithm [11] retains only the angular part of the closeness measure and nonetheless behaves sensibly.
For generic hadron-level jet studies the Durham measure eq. (1) is a good choice because
the majority of emissions are gluons: the correct matrix element to consider in the design
of the measure is that for soft gluon emission (be it from a quark or a gluon) and it always
has both a soft (energy) and collinear (angular) divergence. For flavour algorithms one
should remember that the matrix elements for $g \to q\bar q$ or $q \to qg$ (with a soft quark) have
no soft divergence, but just the collinear divergence,

$$[dk_j]\,|M^2_{g\to q_i\bar q_j}(k_j)| \simeq \frac{\alpha_s T_R}{2\pi}\,\frac{dE_j}{\max(E_i,E_j)}\,\frac{d\theta_{ij}^2}{\theta_{ij}^2}\,, \qquad (E_j \ll E_i\,,\ \theta_{ij} \ll 1)\,, \qquad (4)$$

(note the index $i$ in the energy denominator) and analogously for $q \to g_i q_j$. With the $y_{ij}^{(D)}$
measure, eq. (1), a branching that produces a soft quark, $E_j \ll E_i$, has the same closeness
as in the case of the gluon; however this closeness is now spurious because, in contrast
to the gluon-emission case, there is no divergence for $E_j \to 0$. The replacement of the
$E_j$ denominator in the gluon-emission case, eq. (2), with $E_i$ in the 'soft-quark' emission
case, eq. (4), suggests that the closeness measure for soft $g \to q\bar q$ branching should become
$2\max(E_i^2, E_j^2)/Q^2\,(1 - \cos\theta_{ij})$. A similar argument holds in the case of $q \to g_i q_j$ with
$E_j \ll E_i$. Thus we should use a distance measure that depends on the flavours of the
particles being considered:

$$y_{ij}^{(F)} = \frac{2(1 - \cos\theta_{ij})}{Q^2} \times \begin{cases} \max(E_i^2, E_j^2)\,, & \text{softer of } i,j \text{ is flavoured,} \\ \min(E_i^2, E_j^2)\,, & \text{softer of } i,j \text{ is flavourless,} \end{cases} \qquad (5)$$

where the softer of $i,j$ is the one with the smaller energy and where we use the terms
flavoured and flavourless rather than quark-like and gluon-like so as to allow also for
situations with diquarks or other multi-flavoured objects. With eq. (5) soft-quark 'emission'
leads to no smaller a distance measure than non-soft quark emission, in accord with the
absence of a soft divergence for quark emission. Furthermore if a quark is to recombine
with a harder particle it will favour one that is not too hard, in accord with the presence
of $\max(E_i, E_j)$ in the denominator of eq. (4), which implies that the harder the parent, the
less likely it is that it will produce a quark of a given softness.

With such a distance measure, for configurations as in figure 2 the soft $q$ and $\bar q$ will
have similar energies, $E_3 \sim E_4 \ll Q$. Thus $y_{13} \sim y_{14} \sim y_{23} \sim y_{24} \sim 1$, whereas $y_{34} \sim
E_3^2/Q^2 \ll 1$. So independently of the precise (large) angles of the soft $q\bar q$ pair, 3 and 4,
it is that soft pair that will recombine first to give a gluon-like pseudo-jet $g$. This will
have $y_{1g} \sim y_{2g} \sim E_3^2/Q^2$ and now the soft gluon pseudo-jet will recombine with either
1 or 2 (which one depends on the angles) and the net flavour of the hard particles will
remain unchanged. Therefore, at order $\alpha_s^2$, our new measure correctly eliminates the soft
flavour-changing divergence that exists for the plain Durham algorithm.
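The argument can be illustrated with a toy version of the figure 2 configuration. The energies and angles below are assumptions chosen for illustration (all four partons lie in one plane, so the opening angle is the difference of polar angles), and `distance` switches between the Durham measure of eq. (1) and the flavour measure of eq. (5).

```python
import math
from itertools import combinations

Q = 100.0
# label -> (energy, polar angle in a common plane, flavoured?)
partons = {
    1: (50.0, 0.0, True),                  # hard quark
    2: (50.0, math.pi, True),              # hard antiquark
    3: (0.5, math.radians(50.0), True),    # soft quark at large angle
    4: (0.5, math.radians(130.0), True),   # soft antiquark at large angle
}


def distance(i, j, flavour_aware):
    """Eq. (5) if flavour_aware is True, otherwise the Durham measure of eq. (1)."""
    Ei, theta_i, flav_i = partons[i]
    Ej, theta_j, flav_j = partons[j]
    softer_is_flavoured = flav_i if Ei <= Ej else flav_j
    if flavour_aware and softer_is_flavoured:
        scale = max(Ei, Ej) ** 2
    else:
        scale = min(Ei, Ej) ** 2
    return 2.0 * scale * (1.0 - math.cos(theta_i - theta_j)) / Q ** 2


def closest_pair(flavour_aware):
    """Pair of labels that would be recombined first."""
    return min(combinations(partons, 2),
               key=lambda ij: distance(ij[0], ij[1], flavour_aware))
```

In this example the flavour measure recombines the soft pair (3, 4) first, while the Durham measure first attaches one of the soft quarks to the nearer hard parton, which is the flavour pollution discussed above.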

Sometimes in the above algorithm a quark can be recombined with another quark or
with an antiquark of a different flavour. This can happen for example if there are two
large-angle $q\bar q$ pairs. As long as the resulting 'doubly-flavoured' object is treated in the same
way as a quark in the definition of $y_{ij}^{(F)}$, the algorithm will remain infrared safe, because
in the subsequent clustering steps there will be a strong preference for recombining the
multiply-flavoured object with other objects of similar softness, until all soft large-angle
multiply-flavoured objects combine between themselves to produce gluon-like objects (these
then recombine normally with the hard partons).

One may wish to avoid the appearance of multiply-flavoured pseudo-jets altogether, since
they cannot be associated with QCD partons. This can be achieved by vetoing any
recombination that would lead to a multiply-flavoured object, i.e. by replacing step 2 with

2. (bland) Find the specific $i$ and $j$ that correspond to the smallest $y_{ij}^{(F)}$ among those
combinations of $i$ and $j$ whose net flavour corresponds either to an (anti)quark or a gluon,
and recombine them.

We call this a 'bland' variant of the jet-flavour algorithm, since 'excessively flavoured'
clusterings are forbidden. We note that a blandness requirement on clusterings has been
discussed also in [2] (though a simple 'bland' Durham algorithm with the standard $y_{ij}^{(D)}$
remains infrared unsafe).
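The blandness veto amounts to a simple test on the combined net flavour of a candidate pair. A minimal sketch follows, assuming (for illustration only) that the net flavour of each pseudo-jet is stored as a dictionary mapping a flavour label to the number of quarks minus antiquarks; the function names are hypothetical.

```python
def combine(flav_i, flav_j):
    """Net flavour of the recombined object, dropping flavours that cancel."""
    total = dict(flav_i)
    for flavour, n in flav_j.items():
        total[flavour] = total.get(flavour, 0) + n
    return {flavour: n for flavour, n in total.items() if n != 0}


def is_bland(flav):
    """True for gluon-like (no net flavour) or single (anti)quark-like objects."""
    if not flav:
        return True
    return len(flav) == 1 and abs(next(iter(flav.values()))) == 1


def recombination_allowed(flav_i, flav_j):
    """Bland variant of step 2: veto pairs that would give a multiply-flavoured object."""
    return is_bland(combine(flav_i, flav_j))
```

In the bland variant one would then search for the smallest distance only among pairs for which `recombination_allowed` returns `True`.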

An interesting question is that of how much freedom exists in the definition of the
distance measure for a flavour algorithm. Returning to the analysis of fig. 2, the main
requirement for infrared flavour safety is that the soft fermions 3 and 4 should recombine
between themselves before recombining with harder particles. This property is maintained
for the following class of distance measures,$^3$

$$y_{ij}^{(F,\alpha)} = \frac{2(1 - \cos\theta_{ij})}{Q^2} \times \begin{cases} [\max(E_i, E_j)]^{\alpha}\,[\min(E_i, E_j)]^{2-\alpha}\,, & \text{softer of } i,j \text{ is flavoured,} \\ \min(E_i^2, E_j^2)\,, & \text{softer of } i,j \text{ is flavourless,} \end{cases} \qquad (6)$$

where $\alpha$ is a continuous parameter in the range $0 < \alpha \le 2$ (so far we have implicitly
discussed $\alpha = 2$). Above, we stated the requirement that the distance measure should not
introduce 'spurious' extra closeness for a variation of the momenta that does not lead to
any extra divergence. Here though, for $\alpha < 2$ such a spurious extra closeness is present.
Infrared flavour safety is nevertheless preserved, because the extra closeness is weaker than
that which arises in the case of a divergence, i.e. for a soft gluon $j$, $y_{ij}$ vanishes as $E_j^2$,
whereas for a soft quark $j$ it only vanishes as $E_j^{2-\alpha}$.

Naively it would seem that $\alpha = 2$ should give the best identification of flavour. However
there are situations where a hard quark loses energy through multiple collinear gluon
emission and thus becomes a relatively soft quark. In principle there are no large ratios
between the quark energy and the softest of the harder gluons it has emitted. However if
that gluon is a bit harder than the quark, a value of $\alpha < 2$ can make it easier for them to
recombine. Accordingly below we shall examine both $\alpha = 1$ and $\alpha = 2$.
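The scaling of the $\alpha$-dependent measure with the energy of the softer particle can be verified directly. The sketch below assumes the form $[\max(E_i,E_j)]^{\alpha}[\min(E_i,E_j)]^{2-\alpha}$ for a flavoured softer particle, in line with the hadron-collider measure of eq. (17); the function name and the test values are illustrative.

```python
import math


def y_flavour(Ei, Ej, theta, Q, softer_is_flavoured, alpha=2.0):
    """Flavour distance with exponent alpha; reduces to Durham for a flavourless softer particle."""
    hard, soft = max(Ei, Ej), min(Ei, Ej)
    if softer_is_flavoured:
        scale = hard ** alpha * soft ** (2.0 - alpha)
    else:
        scale = soft ** 2
    return 2.0 * scale * (1.0 - math.cos(theta)) / Q ** 2


def halving_ratio(softer_is_flavoured, alpha):
    """Factor by which the distance drops when the softer energy is halved."""
    before = y_flavour(50.0, 1.0, 0.3, 100.0, softer_is_flavoured, alpha)
    after = y_flavour(50.0, 0.5, 0.3, 100.0, softer_is_flavoured, alpha)
    return before / after
```

Halving the energy of a soft gluon reduces the distance by a factor of 4, whereas for a soft quark the factor is $2^{2-\alpha}$: 2 for $\alpha = 1$ and 1 (no reduction) for $\alpha = 2$.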

A rigorous test of jet flavour algorithms can be obtained by numerically investigating 
the infrared safety of the jet flavour in fixed-order calculations. For example, one generates 
events $e^+e^- \to q\bar q$ together with higher orders and clusters them to two jets. With a jet

3 We consider only those that reduce to the Durham algorithm for purely gluonic ensembles of particles. 
Figure 3: NLO differential cross section for $e^+e^- \to q\bar q$ events that after jet clustering have
their flavour badly identified, i.e. identified as consisting of two gluon jets (that is, each of
zero net flavour) or two jets each of net flavour larger than 1; the coefficient of $(\alpha_s/2\pi)^2$, as
generated with Event2 [13], is plotted as a function of the Durham $y_3^{(D)}$ three-jet resolution
threshold; results are shown for the Durham and flavour algorithms (for two values of $\alpha$).



algorithm that provides a good reconstruction of the flavour, one expects that each of the 
two jets should have net flavour corresponding to an (anti)quark. Sometimes this does not 
happen — for example each of the two jets may have no net flavour, i.e. be gluon-like. This 
is legitimate in events in which there has been a hard branching (there is not a unique 
clustering to two jets), but for an infrared safe flavour jet algorithm, the probability of this 
happening should vanish in the limit in which there are only soft and collinear emissions. 

To measure the hardness of a given event we use $y_3^{(D)}$, the threshold value of the Durham
jet-resolution below which the event is clustered to three jets or more.$^4$ Figure 3 shows the
differential cross section at next-to-leading order (NLO, order $\alpha_s^2$) for producing events in
which the flavour of the two jets is badly identified. It has been obtained with Event2 [13],
to our knowledge the only NLO code that provides information on the flavour of the
final-state partons.$^5$ One sees that for the Durham algorithm the differential cross section for
events whose jet flavour does not correspond to $q\bar q$ goes to a constant as $\ln y_3^{(D)}$ goes to
$-\infty$. This is the sign of the infrared unsafety of flavour identification in the Durham jet
algorithm. In contrast, in our flavour algorithms (for both values of $\alpha$) the corresponding

4 Any other global event-shape like variable that measures the departure from two jets could equally 
well have been used — the only requirement is that for consistency in comparing the flavour behaviour of 
different jet algorithms one always use a common measure for determining the hardness of the event. 

5 In the default version of Event2 there were subtraction terms that had contributions from final states 
with different flavours — for our studies here we split those subtraction terms so that each one corresponded 
to a unique set of final-state flavours. 
cross section vanishes for $\ln y_3^{(D)} \to -\infty$. Detailed examination of the events with badly
identified flavour at small $y_3^{(D)}$ reveals that one of the (anti)quarks has lost nearly all of
its energy to a hard splitting and goes into the same hemisphere as the other quark, i.e.
identification of the event as consisting of two gluon jets is actually legitimate. Such
configurations appear at order $\alpha_s$ where their cross section is $d\sigma^{\rm LO}/d\ln y_3^{(D)} \sim \alpha_s \sqrt{y_3^{(D)}}$. At
NLO, Sudakov suppression of an extra soft gluon leads to a contribution

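$$\frac{d\sigma^{\rm NLO}}{d\ln y_3^{(D)}} \simeq -\frac{\alpha_s}{2\pi}\left(\frac{C_A}{2} + \frac{C_F}{4}\right)\ln^2 y_3^{(D)}\,\frac{d\sigma^{\rm LO}}{d\ln y_3^{(D)}} \sim \alpha_s^2\,\sqrt{y_3^{(D)}}\,\ln^2 y_3^{(D)}\,, \qquad (7)$$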

which is found to be consistent with the observed numerical results, thus confirming the
interpretation given above for the origin of the small fraction of $gg$-like events at small $y_3^{(D)}$.

Given that one of the possible applications of jet flavour algorithms is in the merging 
of matrix-element and parton-shower calculations, we also wish to examine how flavour 
algorithms behave for Monte-Carlo generated parton-level ensembles of quarks and gluons. 
This is interesting for various other reasons too: Monte Carlo generators produce multiple 
soft and collinear gluon emissions and $g \to q\bar q$ splittings, so they are more likely to 'stress-test'
a jet flavour algorithm; also we can study a much wider variety of processes with them:
for example one can simulate a fake $e^+e^- \to gg$ to examine jet flavour algorithms in a
simple gluonic context; one can also easily use them for studies of hadron-hadron events 
(next section) where currently none of the NLO programs gives direct access to information 
on the flavour of the outgoing partons. 

While Monte Carlo event generators provide considerable flexibility, it can be difficult to 
interpret their results. For example infrared unsafety of the flavour in fixed-order programs 
manifests itself as a non-vanishing probability of misidentification of the flavour as $y_3^{(D)} \to 0$.
With an event generator one is instead likely to see this probability vanishing with an
anomalous dimension, e.g. $(y_3^{(D)})^{c\alpha_s}$ where $c$ is some coefficient (assuming, for the purpose
of the discussion, fixed coupling). 

For an infrared safe jet flavour one expects that for the clustered jets to have a different 
flavour from the Born channel there should have been a hard branching, as in the discussion 
above for the NLO $e^+e^-$ calculation. This would lead to flavour misidentification vanishing
as $(y_3^{(D)})^d$ where $d$ is some pure number (above, $d = 1/2$). This too may however be modified
by an anomalous dimension, becoming for example $(y_3^{(D)})^{d+e\alpha_s}$ where $e$ is some further pure
number.$^6$

In the presence of anomalous dimensions it is difficult to establish from Monte Carlo 
events exactly which functional form one is seeing. Yet another complication is that Monte 
Carlo event generators often do not contain the full structure of soft large-angle divergences, 
so that in any case the anomalous dimensions observed may not correspond to the true 
ones. 



6 One kind of diagram that leads to flavour misidentification is that in which a hard quark loses most of
its momentum by repeated gluon emission, and ends up in the opposite jet. This is similar to non-singlet
small-$x$ quark production in parton distribution functions, known to be enhanced by an all-order double
logarithmic series [14]. Such a series might also appear in the jet-flavour case, leading to a more complex
modification of the naive $(y_3^{(D)})^d$ behaviour than stated in the main text.
Figure 4: Fraction of events (generated by Herwig [15] at parton level) whose flavour
is badly identified by various jet algorithms, shown as a function of the Durham jet
resolution threshold; a large value of $Q$ has been chosen for illustrative purposes, so
as to provide a correspondingly large range in $y_3^{(D)}$; the left-hand plot shows results for
$e^+e^- \to q\bar q$, while the right-hand plot shows the fake "$e^+e^- \to gg$" process as generated by
Herwig (code=107).

Despite these complications, for an infrared safe jet flavour algorithm one expects
flavour misidentification to vanish visibly faster as $y_3^{(D)} \to 0$ than for the infrared unsafe
case. This signal can be made clearer by going to large $Q$ so as to have access to a large
range in $y_3^{(D)}$ (note though that a large value of $Q$ also 'stresses' the jet flavour algorithm,
since it increases the phase space for extra soft $q\bar q$ production). Figure 4 (left) shows the
fraction of events, for each $y_3^{(D)}$ value, where the flavour has been misidentified in various
jet algorithms. It has been generated for $Q = 10^4$ GeV, using Herwig [15] (chosen because
it provides default access also to a fake $e^+e^- \to gg$ reaction, code 107).

One sees clearly different $y_3^{(D)}$ dependences for the Durham versus the flavour jet
algorithms, with the flavour jet algorithm misidentification vanishing considerably more rapidly
(actually as $\sqrt{y_3^{(D)}}$). Here all the flavour algorithms behave similarly. Note also that the
bland Durham algorithm works considerably better than the plain Durham algorithm and
only at very small $y_3^{(D)}$ values does one see it doing worse than the flavour algorithms: for
the bland algorithm to generate a wrong-flavour event there must be a soft $q\bar q$ pair of the
same flavour as the hard $q\bar q$, and additionally the directions of the soft $q\bar q$ must be such as
to lead to jets with net gluon flavour rather than diquark flavour.

This situation changes in the right-hand plot of figure 4, where we consider fake
$e^+e^- \to gg$ events. Here the bland Durham algorithm behaves almost identically to the
normal Durham algorithm. This is expected, since a soft $q\bar q$ pair encounters no blandness
problems when contaminating the flavour of gluon jets. The flavour algorithms all work
systematically better than the Durham-based algorithms, clearly vanishing faster with $y_3^{(D)}$.
One sees differences in normalisation between the different flavour algorithms and the
blandness requirement provides a non-negligible advantage, especially for $\alpha = 2$. This
implies that the flavour misidentification involves more than one $q\bar q$ pair. Nevertheless, the
algorithm remains infrared safe even for multiple soft or collinear $q\bar q$ pairs, as discussed
above$^7$ (see also the appendix for a more general outline of the discussion of IR safety).



3 Jet-flavour algorithms for hadron-hadron collisions 

For hadron-hadron collisions (and DIS) the $k_t$ jet algorithm is similar to that described
in section 2, with a few modifications in the definition of the distances [7, 8]. Given that
there is no unique hard scale $Q$, instead of examining dimensionless $y_{ij}$'s one looks at
dimensionful $d_{ij}$'s. These need to be invariant under longitudinal boosts and the most
widespread convention is to take

$$d_{ij} = \min(k_{ti}^2, k_{tj}^2)\,(\Delta\eta_{ij}^2 + \Delta\phi_{ij}^2)\,, \qquad (8)$$

where $\Delta\eta_{ij} = \eta_i - \eta_j$, $\Delta\phi_{ij} = \phi_i - \phi_j$ and $k_{ti}$, $\eta_i$ and $\phi_i$ are respectively the transverse
momentum, rapidity and azimuth of particle $i$, with respect to the beam. A particle $i$ can
also recombine with the beam and here too one needs a distance measure, usually taken
to be

$$d_{iB} = k_{ti}^2\,. \qquad (9)$$

It is the smallest of the $d_{iB}$ and the $d_{ij}$ that determines which recombination takes place.
If it is $d_{iB}$ that is smallest at a given step, then $i$ recombines with the beam (or else gets
called a jet, in the "inclusive" version of the algorithm).
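Eqs. (8) and (9) translate directly into code. In the sketch below the function names are illustrative, and wrapping the azimuthal difference into $[-\pi, \pi)$ is an assumption added here (the text simply writes $\Delta\phi_{ij} = \phi_i - \phi_j$).

```python
import math


def delta_phi(phi_i, phi_j):
    """Azimuthal difference wrapped into [-pi, pi); the wrapping is an assumption."""
    return (phi_i - phi_j + math.pi) % (2.0 * math.pi) - math.pi


def d_ij(kt_i, eta_i, phi_i, kt_j, eta_j, phi_j):
    """Pairwise distance of eq. (8)."""
    separation2 = (eta_i - eta_j) ** 2 + delta_phi(phi_i, phi_j) ** 2
    return min(kt_i, kt_j) ** 2 * separation2


def d_iB(kt_i):
    """Particle-beam distance of eq. (9)."""
    return kt_i ** 2
```

Because only rapidity differences enter, shifting both rapidities by the same amount (a longitudinal boost) leaves `d_ij` unchanged.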

The modification of the $d_{ij}$ needed to obtain a flavour-safe jet algorithm is directly
analogous to that used for the $e^+e^-$ algorithm:

$$d_{ij}^{(F)} = (\Delta\eta_{ij}^2 + \Delta\phi_{ij}^2) \times \begin{cases} \max(k_{ti}^2, k_{tj}^2)\,, & \text{softer of } i,j \text{ is flavoured,} \\ \min(k_{ti}^2, k_{tj}^2)\,, & \text{softer of } i,j \text{ is flavourless,} \end{cases} \qquad (10)$$

where by 'softer' we now mean that having lower $k_t$ and where temporarily, for simplicity,
we consider only the case $\alpha = 2$.

It is less obvious how to modify the beam distance. The problem is that $d_{iB}$ involves just
a single scale, $k_{ti}^2$, and so there is no "minimum" that one can replace with a "maximum".
However one could imagine that $d_{iB}$ is actually the minimum of $k_{ti}^2$ and some transverse
scale associated with the beam, $k_{tB}^2$, which has never been explicitly needed so far because

7 Note though that for a fixed degree of softness, the presence of multiple $q\bar q$ pairs, spread densely in
rapidity from large-angles all the way to the hard-fragmentation region can lead to a systematic worsening 
of the flavour identification. 
it was always larger than any of the $k_{ti}^2$. The analogue of eq. (10) would then be to take

$$d_{iB}^{(F)} = \begin{cases} \max(k_{ti}^2, k_{tB}^2)\,, & i \text{ is flavoured,} \\ \min(k_{ti}^2, k_{tB}^2)\,, & i \text{ is flavourless.} \end{cases} \qquad (11)$$

The question that remains is how to define $k_{tB}$.

A first issue is that we will want to identify the flavour of each of the incoming beams.
So whereas for the normal $k_t$ algorithm one recombines particles with 'the beams', here we
will need to specify which of the two beams a particle recombines with. Therefore we will
need to define $k_{tB}$ for the beam moving towards positive rapidities (right) and $k_{t\bar B}$ for the
other beam.

In line with the DGLAP idea [16] of logarithmic ordering, such that harder emissions
are at successively larger angles with respect to the beam that produced them, it makes 
sense for the beam hardness to be a function of rapidity, $k_{tB}(\eta)$. In the definition of $d_{iB}$,
eq. (11), one would then use $k_{tB}(\eta_i)$. For the right-moving (positive rapidity) beam, one
scale that appears naturally is (with $\Theta(0) = 1/2$),

$$p_{t,\rm right}(\eta) = \sum_i k_{ti}\,\Theta(\eta_i - \eta)\,, \qquad (12)$$

i.e. the beam scale should be at least as hard as all emissions that have already occurred
from that beam (i.e. all emissions that are at larger rapidity). Another scale that arises is

$$p_{\alpha,\rm left}(\eta) = \sum_i k_{ti}\,e^{\eta_i}\,\Theta(\eta - \eta_i)\,. \qquad (13)$$

When one performs a Sudakov decomposition of all momenta $k_i = \alpha_i P + \beta_i \bar P + k_{ti}$ ($P =
(1, 0, 0, 1)$ and $\bar P = (1, 0, 0, -1)$), in the massless approximation, this scale is just the sum
of the $\alpha_i = k_{ti} e^{\eta_i}$ components of all particles that are still to be emitted by this beam (i.e.
are at smaller rapidity). It is equivalent to the light-cone momentum still left in the beam.
This scale depends on the reference frame, but can be transformed into a boost invariant,
local 'transverse' hardness by multiplying it by $e^{-\eta}$, giving$^8$

$$p_{t,\rm left}(\eta) = \sum_i k_{ti}\,e^{\eta_i - \eta}\,\Theta(\eta - \eta_i)\,. \qquad (14)$$

By adding the two measures, $p_{t,\rm right}(\eta)$ and $p_{t,\rm left}(\eta)$ for the beam scale, one obtains an
overall beam hardness measure,

$$k_{tB}(\eta) = \sum_i k_{ti}\left(\Theta(\eta_i - \eta) + \Theta(\eta - \eta_i)\,e^{\eta_i - \eta}\right)\,, \qquad (15)$$



8 Another way of seeing how this scale arises naturally is to recall that in the non-longitudinally invariant
version of the $k_t$ algorithm for DIS and hadron-hadron collisions [7], the beam distance is $d_{iB} = 2E_i^2(1 -
\cos\theta_{iB})$. Replacing $E_i$ with the effective beam energy $\frac{1}{2} p_{\alpha,\rm left}$ (i.e. taking the larger of $E_i$ and the effective
beam energy) and taking the small-angle limit gives precisely $p_{t,\rm left}^2$.
that takes into account both emissions that have already occurred at a certain rapidity (in
the picture of ordering of emissions) and those that will occur further on. Similarly one
defines a scale for the other beam

$$k_{t\bar B}(\eta) = \sum_i k_{ti} \left( \Theta(\eta - \eta_i) + \Theta(\eta_i - \eta)\, e^{\eta - \eta_i} \right) . \qquad (16)$$

In the same way that one updates the $d_{ij}$ and $d_{iB}$ after each clustering, one should update
also the $k_{tB}$ and $k_{t\bar B}$.
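As a rough illustration of how these beam scales could be evaluated, the following Python sketch computes both of them for a list of partons. It is only a sketch under stated assumptions: the `(kt, eta)` tuple layout and the function names are hypothetical, and the scale for the other beam is assumed to be the mirror image of eq. (15) under $\eta \to -\eta$.

```python
import math


def theta(x):
    """Heaviside step function with the convention Theta(0) = 1/2."""
    if x > 0.0:
        return 1.0
    if x < 0.0:
        return 0.0
    return 0.5


def beam_hardness(eta, particles):
    """Return (k_tB, k_tBbar) at rapidity eta.

    `particles` is a list of (kt, eta_i) pairs; this layout is an
    assumption made for the sketch.
    """
    kt_b = 0.0
    kt_bbar = 0.0
    for kt, eta_i in particles:
        # Right-moving beam B, eq. (15): emissions at larger rapidity count
        # in full, those at smaller rapidity are damped by exp(eta_i - eta).
        kt_b += kt * (theta(eta_i - eta)
                      + theta(eta - eta_i) * math.exp(eta_i - eta))
        # Left-moving beam Bbar: assumed mirror image under eta -> -eta.
        kt_bbar += kt * (theta(eta - eta_i)
                         + theta(eta_i - eta) * math.exp(eta - eta_i))
    return kt_b, kt_bbar


# Toy event: three partons given as (kt in GeV, rapidity).
event = [(100.0, 0.5), (90.0, -0.7), (5.0, 3.0)]
for eta in (-4.0, 0.0, 4.0):
    print(eta, beam_hardness(eta, event))
```

Because both scales are sums over all partons, they have to be recomputed (or updated incrementally) whenever a clustering step changes the parton list.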
Figure 5: Plot of $k_{tB}$ and $k_{t\bar B}$ for a multi-jet parton-level LHC event, generated by Herwig;
also shown is the histogram of the rapidity distribution of transverse momenta.

To illustrate the properties of $k_{tB}$ and $k_{t\bar B}$, fig. 5 shows these two quantities for a typical
multi-jet LHC event (represented as a histogram of total transverse momentum per bin of
rapidity). Towards positive rapidities, $k_{tB}(\eta)$ decreases as $e^{-\eta}$, while $k_{t\bar B}(\eta)$ approaches a
constant, so that, as is natural, positive-rapidity particles combine with $B$, while negative-rapidity
particles combine with $\bar B$. At the point where $k_{tB}$ and $k_{t\bar B}$ cross, they are of the
same order of magnitude as the total transverse momentum in the event, i.e. its overall
hardness. Note also that $k_{tB}(\eta)$ and $k_{t\bar B}(\eta)$ are always at least as hard as the hardest
emission at rapidity $\eta$.

Let us now summarise the jet flavour algorithm for hadron-hadron collisions: 

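1. Introduce a distance measure $d_{ij}^{(F,\alpha)}$ between every pair of partons $i$, $j$:

$$d_{ij}^{(F,\alpha)} = (\Delta\eta_{ij}^2 + \Delta\phi_{ij}^2) \times \begin{cases} \max(k_{ti}, k_{tj})^{\alpha}\, \min(k_{ti}, k_{tj})^{2-\alpha}\,, & \text{softer of } i, j \text{ is flavoured,} \\ \min(k_{ti}^2, k_{tj}^2)\,, & \text{softer of } i, j \text{ is flavourless,} \end{cases} \qquad (17)$$

as well as a distance between each parton and the beam $B$,

$$d_{iB}^{(F,\alpha)} = \begin{cases} \max(k_{ti}, k_{tB}(\eta_i))^{\alpha}\, \min(k_{ti}, k_{tB}(\eta_i))^{2-\alpha}\,, & i \text{ is flavoured,} \\ \min(k_{ti}^2, k_{tB}^2(\eta_i))\,, & i \text{ is flavourless,} \end{cases} \qquad (18)$$

and an analogous definition of $d_{i\bar B}^{(F,\alpha)}$ involving $k_{t\bar B}(\eta_i)$ instead of $k_{tB}(\eta_i)$ (both defined
as in eqs. (15) and (16)).⁹ As in section 2 we have introduced a class of measures,
parametrised by $0 < \alpha \le 2$.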

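2. Identify the smallest of the distance measures. If it is a $d_{ij}^{(F,\alpha)}$, recombine $i$ and $j$; if
it is a $d_{iB}^{(F,\alpha)}$ or $d_{i\bar B}^{(F,\alpha)}$, recombine $i$ with the corresponding beam; if the two beam
distances are equal, recombine $i$ with the beam that has the smaller of $k_{tB}(\eta_i)$, $k_{t\bar B}(\eta_i)$.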

3. Repeat the procedure until all the distances are larger than some $d_{\rm cut}$, or, alternatively,
until one reaches a predetermined number of jets.¹⁰ ¹¹
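The distance measures of step 1 can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' implementation: the function names are hypothetical, the caller is assumed to know whether the softer object carries flavour, and the beam distance is assumed to take the same max/min form as eq. (17) with the beam scale $k_{tB}(\eta_i)$ playing the role of the second transverse momentum.

```python
def flavour_dij(kt_i, kt_j, d_eta, d_phi, softer_is_flavoured, alpha=2.0):
    """Pairwise flavour distance in the spirit of eq. (17)."""
    geometry = d_eta ** 2 + d_phi ** 2
    harder, softer = max(kt_i, kt_j), min(kt_i, kt_j)
    if softer_is_flavoured:
        # A soft flavoured parton is kept far from a much harder partner.
        return geometry * harder ** alpha * softer ** (2.0 - alpha)
    # A soft flavourless parton behaves as in the ordinary kt algorithm.
    return geometry * softer ** 2


def flavour_dib(kt_i, kt_beam, i_is_flavoured, alpha=2.0):
    """Beam distance; kt_beam is the beam hardness evaluated at eta_i."""
    harder, softer = max(kt_i, kt_beam), min(kt_i, kt_beam)
    if i_is_flavoured:
        return harder ** alpha * softer ** (2.0 - alpha)
    return softer ** 2


# A soft parton (kt = 1) at unit separation from a hard one (kt = 100):
print(flavour_dij(1.0, 100.0, 1.0, 0.0, softer_is_flavoured=True))
print(flavour_dij(1.0, 100.0, 1.0, 0.0, softer_is_flavoured=False))
```

With $\alpha = 2$ the flavoured case reduces to the square of the larger transverse momentum, which is what prevents a soft quark from being absorbed by a hard jet or by the beam before it meets its soft antiquark partner.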

In the 'bland' variant of the algorithm one considers only those $d_{ij}$ for which the product
of the recombination would have at most one flavour. Similarly one considers only a subset
of the $d_{iB}$; in this case the blandness requirement is imposed on the flavour of the parton
entering the hard interaction, or equivalently on the difference between the flavour of the
incoming hadron and the flavour contained in the outgoing beam jet.
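The blandness requirement on a pairwise recombination can be illustrated with a small Python sketch. The representation is an assumption made for illustration only: each object's flavour is a dictionary mapping a flavour label to its net count (quarks minus antiquarks), and "at most one flavour" is read as the recombined object carrying at most one net quark or antiquark.

```python
def combine_flavour(flavour_a, flavour_b):
    """Add two net-flavour dictionaries, dropping entries that cancel."""
    total = dict(flavour_a)
    for label, count in flavour_b.items():
        total[label] = total.get(label, 0) + count
    return {label: count for label, count in total.items() if count != 0}


def bland_recombination_allowed(flavour_a, flavour_b):
    """True if the product is flavourless or carries a single (anti)quark."""
    combined = combine_flavour(flavour_a, flavour_b)
    return sum(abs(count) for count in combined.values()) <= 1


gluon = {}
u_quark = {"u": 1}
u_antiquark = {"u": -1}
d_quark = {"d": 1}

print(bland_recombination_allowed(u_quark, gluon))        # single flavour
print(bland_recombination_allowed(u_quark, u_antiquark))  # flavours cancel
print(bland_recombination_allowed(u_quark, d_quark))      # doubly flavoured
```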

The infrared safety of this algorithm follows from the same arguments that were used in
the $e^+e^-$ context. The beam scales simply ensure that $q\bar q$ pairs that are soft but separated
by $\Delta\eta^2 + \Delta\phi^2 > 1$ recombine with each other before recombining with the beam. This
eliminates the potentially dangerous situation that would otherwise occur, in which first
the $q$ recombines with one beam and then the $\bar q$ recombines with the other beam. Therefore
it is not just the flavours of the outgoing jets that are infrared and collinear safe, but also
those of the incoming beam jets (the determination of the beam-jet flavours of course also
requires knowledge of the incoming parton flavours).

A concrete demonstration of the infrared safety of the hadron-hadron algorithms, analogous
to the $e^+e^-$ figure of section 2, is not possible with currently available tools, because none of the

9 The beam distances in eqs. (15) and (16) have been constructed by considering situations with just
massless partons. However, their definition can be extended to cases with massive particles in the final
state by replacing $k_{ti}$ with $\sqrt{k_{ti}^2 + m_i^2}$. Notice that any heavy non-QCD particles should also be included
in the sums (15) and (16), even if they do not enter the clustering. In DIS, in the Breit frame, $k_{tB}(\eta)$
should include an additional contribution related to the virtual photon, given by $Q(\Theta(\eta) e^{-\eta} + \Theta(-\eta))$,
while $k_{t\bar B}(\eta)$ should have an additional contribution $Q(\Theta(\eta) + \Theta(-\eta) e^{\eta})$, where $Q$ is the photon virtuality.

10 Yet another possibility is to introduce separate measures for the ordering of recombinations and for 
the point where recombination comes to a stop, as in the Cambridge and Aachen algorithms 

11 In light of recent work that relates the $k_t$ algorithm to a geometrical nearest neighbour problem [18]
to reduce its computational complexity to $N \ln N$, it is worth commenting that the simultaneous use
here of both $\min(k_{ti}^2, k_{tj}^2)$ and $\max(k_{ti}^2, k_{tj}^2)$ invalidates the Lemma of [18] that was central in making the
connection with a nearest neighbour problem. It is therefore not clear whether it would be possible to
write the flavour algorithm such that its complexity goes as $N \ln N$. The implementation that we use has
a complexity that scales roughly as $N^2$.
higher-order NLO jet codes [20] provide direct access to information about final-state
flavour. Even if they did, there would be an additional complication compared to $e^+e^-$.
In $e^+e^-$ at Born level, there is only one flavour channel, i.e. $e^+e^- \to q\bar q$. Therefore one
could identify flavour infrared unsafety by examining, for example, the 3-jet NLO cross
section for jets classified as $gg$. In hadron-hadron collisions all flavour channels are present
at Born order, therefore to verify the infrared safety of, say, the $gg \to gg$ channel one
must supplement the NLO 2+3 jet calculation with the $gg \to gg$ Born contribution and
its two-loop corrections, i.e. one must carry out an NNLO 2+2 jet calculation, which is
beyond today's technology. Fortunately an alternative method exists for verifying the IR
safety of flavour identification using just an NLO 2+3 jet calculation, namely by examining
the cross section for doubly-flavoured jets, since these do not appear at Born level, but
are infrared unsafe in the plain $k_t$ algorithm. We hope that flavour information will soon
become available in 2+3 jet NLO codes, making it possible to demonstrate this explicitly.

In the absence of any way of obtaining a fixed-order illustration of the infrared safety of
the flavour algorithms, we resort to investigations of reconstruction of the flavour in parton-level
Monte Carlo events. This is achieved by comparing, event-by-event, the flavours in the
hard $2 \to 2$ partonic scattering with those of the beam and outgoing jets after clustering of
the event to 2+2 jets. Since the normal $k_t$ algorithm does not usually distinguish between
the two beams, we extend it (both normal and bland variants) such that a particle destined
to recombine with 'the beams' is assigned to that with the smaller of $k_{tB}(\eta_i)$ and $k_{t\bar B}(\eta_i)$.

The proportion of events where the original and reconstructed $2 \to 2$ flavours do not
match is shown in fig. 6 as a function of $y_3^{k_t} = d_3^{k_t} / (E_{t1} + E_{t2})^2$. Here $d_3^{k_t}$ is the threshold
value of $d_{\rm cut}$ below which the event is clustered to 3 or more jets in the standard exclusive
longitudinally-invariant $k_t$ algorithm [7]; $E_{t1}$ and $E_{t2}$ are the transverse energies of the
two last jets to be recombined with the beam if there is no $d_{\rm cut}$ [22] (equivalently the two
hardest jets when running the inclusive $k_t$ algorithm [8]). We consider simulated LHC
events and require the hardest jet to have a transverse energy larger than 1 TeV and the
two hardest jets to have $|\eta| < 1$.

Three representative channels, $qq \to qq$ (including $q\bar q \to q\bar q$), $q\bar q \to gg$ and $qg \to qg$ are
shown in fig. 6 as obtained with Herwig ^5]. The standard parton showering in Pythia [22]
gives similar results (with a slightly higher normalisation). We also illustrate the $qg \to qg$
channel using the recently developed transverse momentum ordered shower in Pythia [21].
In all cases one sees that the rate of flavour misidentification falls significantly more rapidly
towards small $y_3^{k_t}$ for the flavour algorithms than for the normal $k_t$ algorithm or its bland
variant.¹² This is similar to what was observed for $e^+e^-$ in section 2 and is a sign of the
infrared safety of the flavour algorithms.

12 It is interesting to note that the bland $k_t$ algorithm sometimes behaves worse than the normal $k_t$
algorithm (e.g. for $q\bar q \to gg$). To see why this happens, consider a beam corresponding to incoming $u$
flavour, together with a soft collinear $u\bar u$ pair. In the normal $k_t$ algorithm, the $u$ and $\bar u$ can recombine with
the beam in any order. In the bland variant the $\bar u$ is prevented from recombining first (because the parton
entering the reaction would then implicitly have $uu$ flavour) and if it has the lower $k_t$ it will instead try
to recombine with the other (wrong) beam. Therefore the bland algorithm actually has an extra source of
infrared-collinear flavour unsafety relative to the plain $k_t$ algorithm.
Figure 6: The proportion of Monte Carlo events in which the flavour of one or more incoming
or outgoing reconstructed parton-level jets differs from the flavour in the corresponding
parton in the original hard event; shown as a function of $\ln y_3^{k_t}$ for three channels (in the
case of $qg \to qg$ for both Herwig ^5] and a recently developed parton shower algorithm in
Pythia [21]); LHC kinematics are used and the events selected are those where the hardest
$k_t$-algorithm jet has a transverse momentum greater than 1 TeV and where the two hardest
jets have $|\eta| < 1$. The range of most common values of $y_3^{k_t}$ depends on the subprocess but
is typically roughly $-8 < \ln y_3^{k_t} < -3$.
One notes that for all algorithms the fall-off is less rapid in the hadron-hadron case
than in $e^+e^-$. This is natural given the increased number of jets and therefore of sources of
radiation which can lead to extra flavour in the final state. Another difference compared to
$e^+e^-$ is that now the $\alpha = 1$ flavour algorithms sometimes fare better than the $\alpha = 2$ case.
This is not systematic and also depends on the Monte Carlo program used to generate
events (compare the Herwig and Pythia results for $qg \to qg$ in fig. 6). The overall normalisation
of the curves also depends on the Monte Carlo program used, and one sees that Pythia with
transverse-momentum ordered showers produces parton-level final states in which it is
systematically harder to cluster back to the original flavour.

4 Outlook 

We have shown in this article that it is possible to define parton-level jets in a manner 
that ensures that their flavour is infrared safe. The key ingredient in doing this was a 
modification of the $k_t$ distance measure, inspired by the different structures of divergences
that appear in quark production and gluon production. In the case of hadron-hadron 
collisions it was also necessary to introduce the concept of a hardness associated with the 
beam at any given point in rapidity. Where possible, explicit NLO verifications confirm 
the infrared safety of the new 'flavour' jet algorithms. Parton-level Monte Carlo studies 
also indicate a significant improvement in the identification of flavour relative to the $k_t$
algorithm. 

To make use of our new algorithms to accurately study jet flavour, it is necessary to have 
access to information about the flavour of final-state partons in NLO jet codes. Currently 
however, most NLO jet codes have been designed assuming that the user has no need for 
information about final-state parton flavour (an exception is Event2 [13]). In light of the
developments presented here, we look forward to flavour information being made available 
in the future (e.g. 

Our original motivation for studying the problem of jet flavour was the need to accurately
combine resummed predictions for hadron-collider dijet event shapes [22] with
corresponding fixed-order predictions [T^ I2Uj. Another simple flavour-related study would
be the investigation of how the relative fractions of quark and gluon jets at hadron collid- 
ers are modified by NLO corrections and how they vary with jet transverse momentum. 
Apart from its intrinsic interest, such information could be of relevance also to the tuning 
of Monte Carlo event generators and studies of hadron multiplicities in jets, both of which 
are sensitive to the proportions of quark and gluon jets. 

One drawback of the algorithms presented here is that, when considering light flavours, 
they can only be applied to partonic and not hadronic events. This is because at each 
recombination they require knowledge of which objects are flavoured (quark-like) rather 
than flavourless (gluon-like) and that information is not present in hadronic final states.
It would be interesting to find a jet algorithm based purely on particle momenta, that
nevertheless provides a good infrared-safe determination of the flavour at parton level. It
is not clear to what extent this is possible.¹³

There is nevertheless one hadron-level context in which this article's flavour algorithms
could be used directly, namely for heavy-quark jets. Currently a heavy-quark jet is
defined as a jet containing one or more heavy quarks (or heavy-quark hadrons). The 
fraction of jets of transverse energy $E_T$ containing a heavy quark of mass $m_Q$ is enhanced
by terms $\alpha_s^n \ln^{2n-1} E_T/m_Q$ for $E_T \gg m_Q$, due to the large multiplicity $\sim \alpha_s^n \ln^{2n} E_T/m_Q$ of
gluons above scale $m_Q$, combined with the possibility that they split collinearly, $g \to Q\bar Q$,
responsible for a further factor $\alpha_s \ln E_T/m_Q$ [26, 27, 28]. Therefore, at high $E_T$ the majority
of so-called heavy-quark jets are not jets induced by a heavy quark, but rather jets in which 
a heavy quark has appeared from the internal branching in the jet. This implies that the 
current definition of heavy-quark jet will lead to large QCD backgrounds in searches for 
new particles which aim to tag an 'intrinsic' heavy quark jet among the decay products of 
the new particle. 

An alternative approach to the study of heavy quark jets would be to consider the net
heavy flavour of jets,¹⁴ i.e. the number of heavy quark hadrons minus heavy antiquark
hadrons in a given jet.¹⁵ With the cone or $k_t$ algorithms such a definition would eliminate
nearly all the final-state logarithmically enhanced terms, leaving just $\alpha_s^n \ln^{n-1} E_T/m_Q$
contributions (involving a final-state BFKL-type resummation [30, 31]). These remaining
terms come from the same diagrams that led to the infrared unsafety of light flavour of a 
jet. They can therefore be eliminated altogether by applying our flavour jet algorithm with 
the minor modification that every occurrence of "flavour" is to be replaced with "heavy
flavour". In this way it becomes possible to give meaning to a concept of intrinsic heavy
flavour, i.e. heavy flavour that originates exclusively from the heavy-flavour component
of parton distribution functions, from hard QCD flavour "creation" (e.g. $gg \to Q\bar Q$) and
from the decay of other heavier particles. We look forward to future phenomenological 
investigation of this concept. 
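The difference between the current content-based definition and the net-flavour definition can be made concrete with a short Python sketch. The encoding is hypothetical and chosen only for illustration: each jet constituent is labelled $+1$ for a heavy-quark hadron, $-1$ for a heavy-antiquark hadron and $0$ otherwise.

```python
def tagged_by_content(constituents):
    """Current definition: the jet contains at least one heavy-flavour hadron."""
    return any(label != 0 for label in constituents)


def net_heavy_flavour(constituents):
    """Net definition: heavy-quark hadrons minus heavy-antiquark hadrons."""
    return sum(constituents)


# Jet initiated by a light parton in which a gluon split collinearly to Q Qbar.
gluon_splitting_jet = [0, 0, 1, -1, 0]
# Jet initiated by a heavy quark.
heavy_quark_jet = [1, 0, 0]

for jet in (gluon_splitting_jet, heavy_quark_jet):
    print(tagged_by_content(jet), net_heavy_flavour(jet))
```

Both toy jets are tagged under the content-based definition, whereas only the second has non-zero net heavy flavour. The net count alone does not remove the large-angle soft $Q\bar Q$ contributions discussed above; for that, the flavour algorithm itself is needed.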
Appendix

The arguments for the infrared safety of our jet flavour algorithms, as discussed in section 2,
applied only to the case of one or two extra soft $q\bar q$ pairs. Here we give an outline of a
general all-order discussion of infrared and collinear safety of the flavour. It will be framed
in the context of $e^+e^-$ collisions, and then in closing we will briefly mention hadronic
collisions.

For a general discussion of the infrared and collinear safety of flavour one needs to 
examine all divergent cases in which flavour is either produced or moved from one part of 
the event to another. Production of flavour arises from gluon splitting. This has just a 
collinear divergence; additionally the gluon itself has soft and collinear divergences with 
respect to other quarks and gluons. Flavour can 'move' during the branching process when 
a quark recoils due to emission of a gluon of similar hardness to the quark. This has no 
divergences, but there may be divergences associated with the original production of the 
quark itself. Flavour can also move during the jet-clustering procedure whenever a quark 
recombines with a parton that is not collinear to it and whose momentum is of the same 
order of magnitude as (or larger than) the quark. 

Let us first consider flavour production by collinear splitting of a gluon. The Durham 
algorithm always recombines collinear particles into the same jet. Since in $g \to q\bar q$ splitting
there is no soft divergence, the $q$ and $\bar q$ have commensurate hardnesses. Therefore the
'flavour' distance measure defined in section 2 is of the same order of magnitude as the Durham distance
measure and so the $q\bar q$ from a collinear splitting of a gluon will end up in the same jet also
in the flavour algorithm, leaving the jet flavour unchanged as is required for IRC safety of 
the flavour. 

Next we consider non-collinear splitting of a gluon into $q\bar q$. This has divergences when
the original gluon is collinear to some other parton and/or soft. If the gluon itself is
collinear to some other parton $a$, angle $\theta_{ag} \ll 1$, then the gluon splitting to $q\bar q$ is strongly
suppressed unless $\theta_{q\bar q} \lesssim \theta_{ag}$, i.e. non-collinear splitting is not possible from a gluon that
is collinear to some other parton. This is the basis of the widely used angular ordering
approximation. Therefore a $q\bar q$ produced from a collinear (and optionally soft) gluon will
always recombine, in the flavour algorithm as in the Durham algorithm, ensuring the safety 
of the flavour of any resulting jet. 

This leaves the case discussed already in the main text, in which a large-angle $q\bar q$ pair
is produced from a large-angle soft gluon. We have already presented the arguments that
explain the IR unsafety of the Durham algorithm in this case and the IR safety of the
flavour algorithms.

In generalising the analysis to higher orders one needs also to examine potential 'motion'
of the soft large-angle $q$ and $\bar q$. It will be useful to introduce the compact notation $y_{1\{2 \ldots n\}}$
for the set of distance measures $y_{12}, y_{13}, \ldots, y_{1n}$.



Figure 7: Configurations in which flavour 'moves' during branching and clustering, dis- 
cussed in the text with regards to infrared and collinear safety. 

Firstly the quark (or anti-quark) can itself emit a large-angle gluon of similar softness
($k_5$), fig. 7 (left). This will change the direction of the quark ($k_3$). In the Durham algorithm,
each of the $y_{\{12\}\{345\}}$ is of the order of the soft gluon $k_t^2/Q^2$, and the recombination
sequence depends significantly on the angles. In particular the emission of $k_5$ from the
quark may have moved it further away from the antiquark, making it more likely that the
soft $q\bar q$ end up in different jets. In contrast, in the flavour algorithm $y_{\{12\}\{34\}}$ are of order
1, whereas $y_{\{1234\}5}$ and $y_{34}$ are of order of the soft gluon $k_t^2/Q^2$. Therefore 3, 4 and 5 will
all recombine together first, or 5 will recombine with the hard jets and then 3 and 4 will
recombine together. In both cases the flavour of the soft quarks is neutralised.
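The hierarchy of distances invoked here can be checked numerically with a small Python sketch. It rests on assumptions that are not spelled out in this appendix: the Durham measure is taken as $2\min(E_i^2, E_j^2)(1-\cos\theta_{ij})/Q^2$, and the $\alpha = 2$ flavour measure is assumed to replace the minimum by the maximum whenever the softer parton is flavoured. The energies and the common opening angle are arbitrary illustrative values.

```python
import math


def durham_y(e_i, e_j, angle, q):
    """Durham (kt) distance: always set by the softer of the two energies."""
    return 2.0 * min(e_i, e_j) ** 2 * (1.0 - math.cos(angle)) / q ** 2


def flavour_y(e_i, e_j, angle, q, softer_is_flavoured):
    """Assumed alpha = 2 flavour distance for e+e-: the harder energy sets
    the scale whenever the softer parton carries flavour."""
    scale = max(e_i, e_j) if softer_is_flavoured else min(e_i, e_j)
    return 2.0 * scale ** 2 * (1.0 - math.cos(angle)) / q ** 2


Q = 1.0
E_HARD = 0.5      # hard partons 1 and 2
E_SOFT = 1.0e-3   # soft partons 3 (q), 4 (qbar) and 5 (g)
ANGLE = math.pi / 2.0

y13_durham = durham_y(E_HARD, E_SOFT, ANGLE, Q)            # soft quark, hard parton
y13_flavour = flavour_y(E_HARD, E_SOFT, ANGLE, Q, True)    # same pair, flavour measure
y15_flavour = flavour_y(E_HARD, E_SOFT, ANGLE, Q, False)   # soft gluon, hard parton
y34_flavour = flavour_y(E_SOFT, E_SOFT, ANGLE, Q, True)    # soft quark, soft antiquark

print(y13_durham, y13_flavour, y15_flavour, y34_flavour)
```

In this toy configuration the Durham distance between the soft quark and a hard parton is as small as the distance between the soft quark and antiquark, so the clustering order is decided by angles alone. With the assumed flavour measure the quark to hard-parton distance is of order one, while the distances involving only soft partons, or the soft gluon and a hard parton, remain of order $E_{\rm soft}^2/Q^2$.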

The analysis of the right-hand diagram of figure 7 is largely similar as long as $k_5$ is at
large angles and of the same hardness as $k_3$ and $k_4$. The additional issue is that now $k_5$
has a collinear divergence with respect to $k_2$. One might generally worry that semi-hard
radiation collinear to $k_2$ might pull $k_3$ far away from its original direction. This could
happen if $k_5$ is collinear to $k_2$ and if $k_3$ and $k_5$ recombine, with $E_5 \gg E_3$ such that the
recombination product ends up collinear to $k_2$. However if $E_5 \gg E_3$ then $y_{35} \gg y_{34}$ and
$k_3$ and $k_4$ will recombine first, neutralising the flavour. Note that if $E_5 \sim E_3$ and $k_5$ is collinear
to $k_2$ then the $k_2$-$k_5$ recombination will occur first, leaving the usual (safe) configuration
consisting of a soft $q\bar q$ pair.

One can straightforwardly extend this analysis to multiple $q\bar q$ pairs and multiple gluons.
The originally soft large-angle quark can be dragged further and further towards the hard
jet in ensembles with multiple gluons of similar $k_t$'s but successively larger (but not strongly
ordered) energies. However, given any fixed number of recombinations, in the soft limit
the resulting quark-like object always has energy $\ll Q$ and will recombine with the soft
antiquark rather than with the hard particles.

A final comment concerns hadron-hadron collisions. There, the beam jets have a hardness
$k_{tB}(\eta)$ which is of the same order of magnitude as any hard final-state jets that might
have been emitted at the rapidity $\eta$. Therefore there is no difference from the point of
view of IRC safety between recombination into final-state jets and into beam jets and all
the arguments given here apply equally well in the hadron-hadron context.